Graphics, Vision and Image Processing Journal, ISSN 1687-398X, Volume 16, Issue 3, ICGST LLC, Delaware, USA, Dec. 2016 







Region Based Integrated Approach for Image Retrieval 

1 TALLURI SUNIL KUMAR, 2 T. V. RAJINIKANTH, 3 B. ESWARA REDDY

1 VNR Vignana Jyothi Institute of Engineering and Technology, Hyderabad, Telangana, India

2 SNIST, Hyderabad, Telangana, India

3 Professor in CSE and Principal, JNTU-A College of Engineering, Kalikiri, Chittoor Dist, Andhra Pradesh, India
Email: 1 sunilkumartl973@gmail.com, 2 rajinitv@gmail.com, 3 eswarcsejntua@gmail.com


Abstract 

This paper proposes an integrated method for efficient
content based image retrieval using color, shape and
rotational invariant texture features. Rotational invariant
features are derived on each region of the image. Textons
are computed to derive shape features. To represent
texture, gray level co-occurrence matrix (GLCM) features
are derived on a region based rotational invariant texton
matrix. These features are combined with HSV histograms.
The advantage of region based models is that they are
better suited to images of large size, especially in a real
time environment. Image retrieval is performed on five
categories of the Wang database, and the present method is
compared with the texton co-occurrence matrix (TCM),
color correlogram gradient (CCG) and GLCM methods.

Keywords: GLCM; HSV; shape; texture; rotation
invariant features

1. Introduction 

Browsing of digital libraries and databases has expanded
enormously in recent years. Searching and retrieving
images from these libraries through human annotation has
become a crucial and tedious task, and this has created a
pressing need for content based image retrieval (CBIR)
methods. CBIR methods retrieve the desired images from
these libraries based on the image contents. CBIR models
make use of the visual contents of an image, such as color,
shape, texture mosaic, faces and spatial layout, for
efficient image retrieval (IR). It is practically impossible
to represent an image with a single best feature, because
users may capture photographs from different angles and
under different lighting conditions, reflections and so on.
The traditional IR methods are text based: images are
retrieved by matching the index text or meta-data
associated with them. A comprehensive literature survey
on CBIR is presented in [1-4].

The color content of an image is one of the most powerful
descriptors for CBIR; it remains semantically intact and is
robust to noise, change in size, image degradation and
orientation. Various CBIR systems are based on color
descriptors [5, 6, 7, 8]. The retrieval performance of these
systems degrades on huge databases due to color shading
problems. Texture is one of the most characteristic visual
features of an image, and texture features play an
important role in many applications such as image
classification [9, 10, 11], face recognition [12, 13],
smoke detection [14], age and facial expression
identification [15, 16], pedestrian detection [17, 18] and
image retrieval [19, 20, 21, 22, 23]. Various methods have
been proposed for extracting texture features, such as
co-occurrence matrices [24], local binary patterns [25, 26],
textons [27] and pattern based methods [28, 29]. These
methods can be roughly classified into statistical,
structural and model based methods. Most of the pattern
based methods retrieve the desired images based on the
frequency of each pattern in the image, using histograms
of these frequencies as the feature descriptor. The
frequency indicates how many times each pattern appears
in the image, but it does not reveal any information about
the mutual occurrence of patterns. The present paper
addresses this by making use of textons.

IR methods based on texture descriptors such as Gabor
transforms [30] and rotated wavelet filters [31] have been
proposed in the literature. Other CBIR models are based on
relevance feedback techniques [32], robust local patterns
[33], temporal patterns of video sequences [34] and the
combination of relevance feedback with region based
features [35]. Recently, various pattern based features,
i.e. local maximum edge patterns [36] and local tetra
patterns [37], have been proposed for natural IR. Pattern
based features have also been proposed for retrieval of
medical images, i.e. directional binary wavelet patterns
[38], local mesh patterns [39] and local ternary
co-occurrence patterns [40]. Block based methods using
LBP texture descriptors were proposed by Takalo et al.
[41] for CBIR. The present paper divides the image into
multiple regions and evaluates the features on each region.
This provides detailed relative location similarity and
reduces the computational complexity. Earlier works on
CBIR treated texture and color information as individual
features. In this work, region based rotational invariant
texture features are integrated with shape and color space
components for efficient image retrieval. The present paper
is organized as follows. The second section describes the
concepts of basic LBP and the generation of rotational
invariant uniform LBP. The third section describes the
methodology and framework. The fourth and fifth sections
give the results and discussion, and the conclusions.

2. Local binary pattern (LBP) 

Ojala et al. [42] introduced a powerful local gray scale
descriptor called LBP for texture classification. LBP
utilizes the intensity distribution of local neighborhood
pixels. The LBP code on a neighborhood is computed by
comparing the grey level value of each neighboring pixel
(g_p) with that of the central pixel (g_c), as shown in
Figure 1, based on the following equations.


$LBP_{P,R} = \sum_{p=0}^{P-1} s(g_p - g_c)\, 2^p$    (1)

$s(x) = \begin{cases} 1 & \text{if } x > 0 \\ 0 & \text{otherwise} \end{cases}$    (2)

Where P is the number of neighboring pixels and R is the
radius of the neighborhood. A 3x3 neighborhood will
have P=8 and R=1. The co-ordinates of the neighborhood
pixels are computed as $(R\cos(2\pi p/P),\, -R\sin(2\pi p/P))$
and their grey levels are estimated by interpolation.

The LBP_{8,1} operator produces 2^8 different binary
patterns, which results in a total of 256 LBP codes, i.e. a
feature vector of length 256. When the image is rotated,
the gray level values of P_i correspondingly move along
the perimeter of the circle around the central pixel P_c.
The pixel P_i of the neighborhood is usually assigned the
co-ordinate position (0, 0), as shown in Figure 2. Rotating
a particular binary pattern on the perimeter naturally
results in different LBP_{8,1} codes. This does not apply to
the constant binary patterns, i.e. those containing all zeros
or all ones (00000000 or 11111111). To overcome this
rotation effect and make the local binary pattern rotation
invariant, a unique identifier is obtained by taking the
minimum or maximum value over all rotations, as given in
equations 3 and 4.

$LBP^{ri}_{8,1} = \min\{ROR(LBP_{8,1}, i) \mid i = 0, 1, \ldots, 7\}$    (3)

or

$LBP^{ri}_{8,1} = \max\{ROR(LBP_{8,1}, i) \mid i = 0, 1, \ldots, 7\}$    (4)

Where ROR(z, i) performs a circular bitwise right shift
on the 8-bit binary number z, i times. The min or max
operator selects the minimum or maximum LBP code
from these 8 circular shifts. This becomes the rotation
invariant LBP (LBP^ri).
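
The following is a minimal Python sketch of equations 1 to 3 for a single 3x3 window, using only the standard library. The function names (lbp_code, ror, lbp_ri), the list-of-lists window and the order in which the eight neighbours are visited are assumptions of this sketch, not details given in the paper.

```python
def lbp_code(window):
    """Basic LBP code of a 3x3 window (equations 1 and 2)."""
    gc = window[1][1]
    # Neighbours visited in a fixed circular order; the starting
    # position and direction are assumptions of this sketch.
    order = [(1, 2), (0, 2), (0, 1), (0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]
    code = 0
    for p, (r, c) in enumerate(order):
        if window[r][c] - gc > 0:  # s(x) = 1 if x > 0, as printed in eq. 2
            code |= 1 << p
    return code


def ror(z, i, bits=8):
    """Circular bitwise right shift of the bits-wide number z, i times."""
    i %= bits
    mask = (1 << bits) - 1
    return ((z >> i) | (z << (bits - i))) & mask


def lbp_ri(code, bits=8):
    """Rotation invariant LBP: minimum over all circular shifts (eq. 3)."""
    return min(ror(code, i, bits) for i in range(bits))


if __name__ == "__main__":
    window = [[5, 9, 1],
              [4, 4, 6],
              [7, 2, 3]]
    code = lbp_code(window)
    print(code, lbp_ri(code))
```

Taking the minimum means that every rotated version of a binary pattern maps to the same identifier; replacing min with max would give the variant of equation 4.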
Figure 2: The basic co-ordinate system of an LBP window.


Table 1: ULBP^ri values and indexes on LBP_{8,R}.

Figure 1: LBP code generation.

2.1 Derivation of Rotational Invariant ULBP
(ULBP^ri)

LBP with P neighboring pixels results in 2^P possible
binary patterns, and hence a feature vector of length 2^P.
As the number of neighboring pixels increases, e.g. for
(P, R) = (16, 1) and (16, 2), the length of the feature vector
increases drastically. The disadvantage of such a feature
vector is its computational cost. To overcome this,
uniform LBPs (ULBP) [43, 44] were proposed. ULBPs
have a limited number of discontinuities, i.e. at most two
bitwise transitions in the circular binary representation,
and it has been shown that most of the windows (above
90%) in human faces and textures are ULBPs. The
remaining patterns, where the number of transitions from
0 to 1 or 1 to 0 is above two, are considered non-ULBPs
(NULBP). The NULBPs are treated as miscellaneous and
share a single label. This gives P*(P-1)+3 distinct labels on
a neighborhood with P neighboring pixels.
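
As a quick check of the uniformity definition, the sketch below counts the circular 0/1 transitions of every 8-bit pattern. The helper names (transitions, is_uniform) are illustrative and not taken from the paper.

```python
def transitions(code, bits=8):
    """Number of 0/1 changes in the circular binary representation."""
    count = 0
    for p in range(bits):
        a = (code >> p) & 1
        b = (code >> ((p + 1) % bits)) & 1
        count += a != b
    return count


def is_uniform(code, bits=8):
    """A pattern is a ULBP when it has at most two transitions."""
    return transitions(code, bits) <= 2


# All uniform patterns on an 8-pixel neighbourhood.
uniform_8 = [c for c in range(256) if is_uniform(c)]

if __name__ == "__main__":
    # Uniform patterns plus one shared miscellaneous label for NULBPs.
    print(len(uniform_8), len(uniform_8) + 1)
```

For P=8 the list holds 58 uniform patterns; adding the single miscellaneous label for all NULBPs gives 59, which matches P*(P-1)+3 = 8*7+3.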


Rotational invariant ULBP    LBP code value             Index value
on a 3x3 window              according to equation 3    assigned to ULBP^ri
(adjacent 1s)

(00000001)                   1                          1
(00000011)                   3                          2
(00000111)                   7                          3
(00001111)                   15                         4
(00011111)                   31                         5
(00111111)                   63                         6
(01111111)                   127                        7
(11111111)                   255                        8
(00000000)                   0                          9
All others (NULBPs)          -                          0


There are 36 unique rotation invariant LBPs that can occur
on a 3x3 neighborhood, i.e. for LBP_{8,R}. It has been
shown experimentally that LBP_8^{ri36} does not provide
good discrimination [44]. The performance of these 36
patterns in discriminating textures varies greatly, because
some patterns sustain rotation quite well while others do
not and confuse the analysis.

Figure 3: The integrated CBIR model of the present paper


The varying performance of these LBP^ri patterns also led
to the discovery of uniform (U) patterns. A ULBP appears
on LBP_{8,R} whenever there is a single run of zero or more
(<= 8) adjacent ones in any position. Table 1 summarizes
the index values that the present paper assigns to
ULBP^ri.
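
The index assignment of Table 1 can be sketched as a lookup on the minimum rotation of an 8-bit LBP code. The names TABLE_1 and ulbp_ri_index are hypothetical; the mapping itself is copied from Table 1.

```python
# Minimum-rotation LBP value -> index assigned in Table 1.
TABLE_1 = {1: 1, 3: 2, 7: 3, 15: 4, 31: 5, 63: 6, 127: 7, 255: 8, 0: 9}


def ulbp_ri_index(code, bits=8):
    """Index (0 to 9) of an LBP code; every NULBP maps to 0."""
    mask = (1 << bits) - 1
    # All circular right shifts of the code (ROR in equation 3).
    rotations = [((code >> i) | (code << (bits - i))) & mask
                 for i in range(bits)]
    return TABLE_1.get(min(rotations), 0)


if __name__ == "__main__":
    for code in (0b11100000, 0b10000001, 0b01010101, 0, 255):
        print(format(code, "08b"), ulbp_ri_index(code))
```

Because rotation does not change the number of transitions, a non-uniform pattern can never rotate into one of the nine table entries, so the default value 0 is only ever used for NULBPs.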

3. Methodology 

The present paper proposes a novel framework for CBIR
called “multi-region rotational invariant uniform LBP
texton matrix” (MR-ULBP^ri-TM) to overcome the
limitations of LBP and to capture shape information on
multiple regions. The basic image retrieval model of this
paper is given in Figure 3.

The basic LBP operator has the following disadvantages.
It is designed for a small spatial support area (3x3
neighborhood); therefore the bit-wise comparison
between two single pixel values of this neighborhood is
affected by noise to a great extent. In addition, the features
computed on the basic LBP cannot capture larger scale
structure (macrostructure), which may carry the dominant
features of a texture. In this paper the computation is
therefore performed on the average values of sub regions
instead of individual pixels.

3.1. Computation of MR-ULBP^ri

The present paper converts the color image into HSV
color space and derives color histograms. The V component
of the image is divided into non-overlapping regions of
size 9x9. Each region is subdivided into nine
non-overlapping sub-regions. The present multi region
(MR) IR model derives a single value for each
rectangular sub region. The advantage of the present
method is that it reduces the overall dimension of the
derived features. The MR model captures the dominant
features on a large scale rectangular structure, and the sub
region features are estimated on the grey level values of a
local neighborhood. The steps for computation of
MR-ULBP^ri are given below.

Step one: Replace each sub region by its average grey
level value. The region of size 9x9 with 9 sub regions
thereby becomes a 3x3 neighborhood, where each pixel
value represents the average grey level value of that sub
region.

Step two: Compute the LBP on each region using the
average operator. The comparison between single pixels
in LBP is simply replaced with a comparison between the
average gray values of the sub-regions (with the central
sub-region average as threshold). This generates a binary
pattern.

Step three: If the multi region local binary pattern
generated in step two is a ULBP, replace the central pixel
with the MR-ULBP^ri index value given in Table 1.
Otherwise replace the central pixel with the value zero
(NULBP).
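
Steps one to three can be sketched for a single 9x9 region as follows. This is an illustrative sketch under stated assumptions: the region is a 9x9 list of lists of grey levels, the neighbour order is arbitrary, the comparison is strict as in equation 2, and the names block_means and mr_ulbp_ri are hypothetical.

```python
TABLE_1 = {1: 1, 3: 2, 7: 3, 15: 4, 31: 5, 63: 6, 127: 7, 255: 8, 0: 9}


def block_means(region, sub=3):
    """Step one: replace each sub x sub sub-region by its average grey level."""
    n = len(region) // sub
    return [[sum(region[r * sub + i][c * sub + j]
                 for i in range(sub) for j in range(sub)) / (sub * sub)
             for c in range(n)]
            for r in range(n)]


def mr_ulbp_ri(region):
    """Steps two and three: LBP on the 3x3 averages, then the Table 1 index."""
    m = block_means(region)
    centre = m[1][1]
    # Assumed circular order of the eight neighbouring sub-regions.
    order = [(1, 2), (0, 2), (0, 1), (0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]
    code = 0
    for p, (r, c) in enumerate(order):
        if m[r][c] > centre:
            code |= 1 << p
    rotations = [((code >> i) | (code << (8 - i))) & 0xFF for i in range(8)]
    return TABLE_1.get(min(rotations), 0)


if __name__ == "__main__":
    # Bright left third, dark elsewhere: three adjacent ones in the pattern.
    region = [[200 if c < 3 else 50 for c in range(9)] for r in range(9)]
    print(mr_ulbp_ri(region))
```

Each 9x9 region therefore contributes a single value between 0 and 9 to the MR-ULBP^ri image.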

Note that the scalar values of averages over blocks can
be computed very efficiently [45] from the summed-area
table [46] or integral image [47]. For this reason,
MR-ULBP^ri feature extraction can also be very fast: it
incurs only a little more cost than the original 3x3 LBP
operator. The MR-ULBP^ri code thus presents several
advantages: (1) it is rotational invariant and robust; (2) it
encodes not only microstructures but also
macrostructures of image patterns, and hence provides a
more complete image representation than the basic LBP
operator; (3) it can be computed very efficiently using
integral images; (4) it is very useful in deriving textons
because the image is quantized to ten levels (0 to 9): the
ULBP^ri patterns are given indexes from 1 to 9 and all
NULBPs are given zero.
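
The block averages mentioned above can be obtained from a summed-area table with four lookups per block. The sketch below is a generic illustration of that technique; the function names (integral_image, block_mean) are assumptions and are not taken from [45-47].

```python
def integral_image(img):
    """Summed-area table with one extra row and column of zeros."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for r in range(h):
        row_sum = 0
        for c in range(w):
            row_sum += img[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii


def block_mean(ii, top, left, size):
    """Average grey level of a size x size block using four table lookups."""
    bottom, right = top + size, left + size
    total = (ii[bottom][right] - ii[top][right]
             - ii[bottom][left] + ii[top][left])
    return total / (size * size)


if __name__ == "__main__":
    img = [[r * 4 + c for c in range(4)] for r in range(4)]
    ii = integral_image(img)
    print(block_mean(ii, 0, 0, 2), block_mean(ii, 2, 2, 2))
```

Once the table is built in a single pass over the image, the cost of each sub-region average no longer depends on the sub-region size.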

The regions can be small, medium or large, i.e. 3x3, 9x9
and 15x15 neighborhoods respectively. For small scale
regions, as in basic LBP, the local micro patterns of
textures are well represented, which may be beneficial for
discriminating local details. On the other hand, using
average values over large scale regions (15x15) reduces
noise and makes the representation more robust. Large
scale information complements the small scale details,
although much discriminative information is also
dropped. Normally, regions of various scales should be
carefully selected and then fused to achieve better
performance. The present paper chose a region of size 9x9
and sub regions of size 3x3.


The textons are defined on neighboring pixels of a 2x2
window or grid. The pixels of the grid are denoted as P, Q,
R and S. The five types of textons are denoted as A1, A2,
A3, A4 and A5 (Figure 4).


Figure 4: The textons used in this paper. (a) 2x2 window of the image
with pixels P, Q, R and S; A1 to A5: different textons.

3.2. Computation of “Texton Matrix on Multi Region
Rotational Invariant Uniform LBP (MR-ULBP^ri-TM)”

The previous section generates a multi-region based
ULBP^ri (MR-ULBP^ri) image with ten quantized levels
or patterns {0 to 9}. The present section evaluates
textons on this image. LBP and texton based models are
widely used in many applications [48, 49, 50, 51]. It is
very difficult to obtain satisfactory image processing
results with algorithms that process images at the pixel
level. Moreover, such processing fails to represent the
shape component at all. To address this, Julesz [27]
proposed the concept of textons. Textons represent the
relationship between pixels in the form of a shape
component; however, defining a texton is still a difficult
task. The texton is one of the popular and significant shape
primitives and is defined with a certain placement rule.
Textons represent the emergent and dominant patterns in
a local neighborhood.

The image features have a close relationship with textons
and color diversification. Different textons may form
various image features. If the textons in an image are
small and the tonal differences between neighboring
textons are large, a fine texture may result. If the textons
are larger and hold quite a few pixels, a coarse texture
results; this also depends on scale [49]. If the textons in
the image are large and belong to a small number of
texton categories, a shape may result. There can be
numerous types of textons in an image. In this paper, we
classify and make use of only five special types of
textons, which hold three or four neighboring pixels of a
2x2 grid.

Figure 5: Computation of MR-ULBP^ri-TM from the MR-ULBP^ri image.
(a) MR-ULBP^ri image; (b) detection of textons A1, A2 and A3 on the
MR-ULBP^ri image; (c) computation of textons A4 and A5 on the
MR-ULBP^ri image; (d) formation of MR-ULBP^ri-TM (final texton image)
using A1, A2, A3, A4 and A5.

The process of texton identification is shown in Figure
5. The present paper uses the five types of textons to
examine every grid. Detection of a particular texton is
performed on a 2x2 grid in an overlapped manner
(shifting right by one column position, then down by one
row position). If the texton is detected, the pixels of the
texton keep their original values and the others are
replaced with zeros. The same process is repeated for all
five defined categories of textons. The MR-ULBP^ri-TM
(final texton) image (Figure 5(d)) is formed by combining
these five types of texton images (Figure 5(b) and 5(c)).
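
The detection and combination procedure can be sketched as below. The exact texton shapes of Figure 4 are not recoverable from the text, so this sketch assumes that A1 to A4 are the four three-pixel configurations of a 2x2 grid, that A5 covers all four pixels, and that a texton is detected when all of its pixels share the same value. These assumptions, and the names TEXTONS and texton_image, are hypothetical.

```python
# Assumed texton shapes as (row, column) offsets inside a 2x2 grid.
TEXTONS = {
    "A1": [(0, 0), (0, 1), (1, 0)],
    "A2": [(0, 0), (0, 1), (1, 1)],
    "A3": [(0, 0), (1, 0), (1, 1)],
    "A4": [(0, 1), (1, 0), (1, 1)],
    "A5": [(0, 0), (0, 1), (1, 0), (1, 1)],
}


def texton_image(img):
    """Keep pixels that belong to a detected texton; set the rest to zero."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    # Overlapped scan: the 2x2 grid moves by one column, then by one row.
    for r in range(h - 1):
        for c in range(w - 1):
            for cells in TEXTONS.values():
                values = {img[r + dr][c + dc] for dr, dc in cells}
                if len(values) == 1:  # assumed rule: identical values
                    for dr, dc in cells:
                        out[r + dr][c + dc] = img[r + dr][c + dc]
    return out


if __name__ == "__main__":
    img = [[1, 1, 2],
           [1, 3, 2],
           [4, 5, 2]]
    for row in texton_image(img):
        print(row)
```

Writing every detected texton into one output image corresponds to combining the five per-texton images into the final texton image.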

3.3 Computation of GLCM Features on
MR-ULBP^ri-TM

On the MR-ULBP^ri-TM image, the co-occurrence matrix
is formed with a distance D and with angles of 0°, 45°,
90° and 135°. The GLCM features, i.e. entropy, energy,
contrast, local homogeneity and correlation (equations
5, 6, 7, 8 and 9), are computed on MR-ULBP^ri-TM for the
0°, 45°, 90° and 135° orientations, and the average feature
values over these orientations are stored in the feature
library. In order to extract color information, the present
paper also quantizes the original image using the HSV
color space.

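$Entropy = \sum_{i,j=0}^{N-1} -\ln(P_{ij})\, P_{ij}$    (5)

$Energy = \sum_{i,j=0}^{N-1} P_{ij}^{2}$    (6)

$Contrast = \sum_{i,j=0}^{N-1} P_{ij}\, (i - j)^2$    (7)

$Local\ Homogeneity = \sum_{i,j=0}^{N-1} \frac{P_{ij}}{1 + (i - j)^2}$    (8)

$Correlation = \sum_{i,j=0}^{N-1} P_{ij}\, \frac{(i - \mu)(j - \mu)}{\sigma^2}$    (9)


where P_{ij} is the value at position (i, j) of the
co-occurrence matrix of the texture image, N is the number
of gray levels in the image, $\mu = \sum_{i,j=0}^{N-1} i\, P_{ij}$
is the mean of the texture image and
$\sigma^2 = \sum_{i,j=0}^{N-1} P_{ij}\, (i - \mu)^2$ is the
variance of the texture image.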
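
A minimal standard-library sketch of the five features, averaged over the four orientations, is given below. It assumes a symmetric, normalised co-occurrence matrix, takes energy as the sum of squared matrix entries (the usual GLCM definition), and maps the angles 0°, 45°, 90° and 135° to the pixel offsets shown in the code. The function names are illustrative only.

```python
import math


def glcm(img, dr, dc, levels):
    """Symmetric, normalised co-occurrence matrix for the offset (dr, dc)."""
    counts = [[0] * levels for _ in range(levels)]
    h, w = len(img), len(img[0])
    for r in range(h):
        for c in range(w):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < h and 0 <= c2 < w:
                a, b = img[r][c], img[r2][c2]
                counts[a][b] += 1
                counts[b][a] += 1  # count each pair in both directions
    total = sum(map(sum, counts))  # assumes at least one valid pixel pair
    return [[v / total for v in row] for row in counts]


def glcm_features(p):
    """Entropy, energy, contrast, local homogeneity and correlation."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    mu = sum(i * v for i, _, v in cells)
    var = sum(v * (i - mu) ** 2 for i, _, v in cells)
    corr = sum(v * (i - mu) * (j - mu) for i, j, v in cells)
    return {
        "entropy": -sum(v * math.log(v) for _, _, v in cells if v > 0),
        "energy": sum(v * v for _, _, v in cells),
        "contrast": sum(v * (i - j) ** 2 for i, j, v in cells),
        "homogeneity": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
        "correlation": corr / var if var > 0 else 0.0,
    }


def averaged_features(img, d, levels):
    """Average each feature over the 0, 45, 90 and 135 degree offsets."""
    offsets = [(0, d), (-d, d), (-d, 0), (-d, -d)]
    per_angle = [glcm_features(glcm(img, dr, dc, levels))
                 for dr, dc in offsets]
    return {k: sum(f[k] for f in per_angle) / len(per_angle)
            for k in per_angle[0]}


if __name__ == "__main__":
    texton_img = [[0, 0], [1, 1]]
    print(averaged_features(texton_img, d=1, levels=2))
```

For the MR-ULBP^ri-TM image the number of levels would be 10, since the image is quantized to the values 0 to 9.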

3.4 Image Retrieval Algorithm 

The proposed image retrieval algorithm is given below.
Input: Query image. Output: Retrieved similar images.

1. Convert the RGB image into HSV color space.

2. Divide the V component of the image into
non-overlapping regions of size 9x9.

3. Divide each region into sub regions and derive the
feature vector (the region of size 9x9 becomes
3x3).

4. Derive the multi region rotational invariant ULBP
(MR-ULBP^ri) index image (as given in Table 1).

5. Compute the texton matrix on the multi-region
rotational invariant ULBP (MR-ULBP^ri-TM) by
deriving textons on each 2x2 grid of step 4.

6. Derive the multi-region rotational invariant ULBP
texton co-occurrence matrix (MR-ULBP^ri-TCM)
with various distances on the result of step 5.

7. Compute GLCM features on MR-ULBP^ri-TCM.

8. Compute the histograms for the H, S and V color
components.

9. Construct the feature vector by concatenating the
histograms for H, S and V with the
MR-ULBP^ri-TCM features.

10. Compare the features of the query image with the
images in the database using a similarity
measurement.

11. Retrieve the images based on nearest distance or
best matches.
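
Steps 1, 8 and 9 can be sketched with the standard colorsys module. The number of histogram bins is not stated in the paper, so bins=8 is an assumption, as are the function names and the representation of the image as a list of (R, G, B) tuples in the range 0 to 255.

```python
import colorsys


def hsv_histograms(pixels, bins=8):
    """Normalised H, S and V histograms, concatenated into one list."""
    hists = [[0] * bins for _ in range(3)]
    for r, g, b in pixels:
        hsv = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        for k, v in enumerate(hsv):
            # Each of H, S and V lies in [0, 1]; clamp 1.0 into the last bin.
            hists[k][min(int(v * bins), bins - 1)] += 1
    n = len(pixels)
    return [count / n for hist in hists for count in hist]


def build_feature_vector(pixels, texture_features, bins=8):
    """Step 9: colour histograms followed by the texture features."""
    return hsv_histograms(pixels, bins) + list(texture_features)


if __name__ == "__main__":
    pixels = [(255, 0, 0), (0, 0, 0)]
    texture = [0.69, 0.5, 0.0, 1.0, 1.0]  # placeholder feature values
    print(len(build_feature_vector(pixels, texture)))
```

In practice the histogram and texture parts may need to be scaled to comparable ranges before a distance is computed; the paper does not specify such a normalisation.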


3.5 Query Matching 

Query matching is accomplished by measuring the distance
between the query image and the database images. The
present paper uses the Euclidean distance as the distance
measure, as given below.

$Dist_s(T_n, I_n) = \left( \sum_{j} \left| f_j(T_n) - f_j(I_n) \right|^2 \right)^{1/2}$    (10)


Where T_n is the query image, I_n is an image in the
database and f_j denotes the j-th feature of the feature
vector.

Database images are used as query images in our
experiments. If a retrieved image belongs to the same
category as the query image, we say that the system has
correctly identified the expected image; otherwise the
system has failed to find the image.
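
Query matching with the Euclidean distance can be sketched as follows. The database is assumed to be a dictionary from image name to feature vector; this structure and the function names are illustrative only.

```python
import math


def euclidean(f_query, f_image):
    """Euclidean distance between two feature vectors (equation 10)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(f_query, f_image)))


def retrieve(query, database, top_k=16):
    """Names of the top_k database images closest to the query."""
    ranked = sorted(database, key=lambda name: euclidean(query, database[name]))
    return ranked[:top_k]


if __name__ == "__main__":
    database = {"a": [1.0, 1.0], "b": [0.0, 0.1], "c": [5.0, 5.0]}
    print(retrieve([0.0, 0.0], database, top_k=2))
```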




4. Results and Discussion 

To investigate the performance of the present retrieval
model, we considered the Wang database [52]. Wang is a
subset of the Corel stock photo database in which the
images have been manually chosen. The data set used here
consists of 5 classes of images, i.e. Elephants, Fancy
Flowers, Horses, Valleys and Evening Skies, with 100
images per class. The present paper uses these 5 classes of
images for relevance assessment. For a query image, the
relevant images are assumed to be the remaining 99 images
of the same class. The images from all other classes are
treated as irrelevant. The large size of each class and the
heterogeneous contents of the image classes have made the
Wang database one of the popular databases for image
retrieval. The performance of the present model is
evaluated in terms of precision and recall rates. Precision
is the ratio of the number of relevant images retrieved
(Irr) to the number of retrieved images (Inr). Recall is the
ratio of Irr to the total number of relevant images in the
database (Itr).

Precision: P = Irr / Inr    (11)

Recall: R = Irr / Itr    (12)
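
A small sketch of equations 11 and 12 is given below. The class labels and the counts in the example are made up for illustration and are not results from the paper.

```python
def precision_recall(retrieved_labels, query_label, relevant_in_db):
    """Precision (Irr / Inr) and recall (Irr / Itr) for one query."""
    irr = sum(1 for label in retrieved_labels if label == query_label)
    inr = len(retrieved_labels)
    return irr / inr, irr / relevant_in_db


if __name__ == "__main__":
    # Hypothetical query: 12 of the 16 retrieved images are horses,
    # and 99 other horse images exist in the database.
    retrieved = ["horse"] * 12 + ["valley"] * 4
    print(precision_recall(retrieved, "horse", relevant_in_db=99))
```

With Itr fixed at 99 and only 16 images retrieved, recall cannot exceed 16/99 even when precision is 1, which is why recall is studied by increasing Inr.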

The present paper computes GLCM features on
MR-ULBP^ri-TCM using various distance values,
D = 1, 2, ..., 7, and query matching is performed using the
Euclidean distance. The present retrieval model selects
the top 16 images from the database that match the query
image. We also experimented with larger numbers of top
images and measured the retrieval performance. Figure 6
shows five examples of retrieved images, i.e. one query
from each class, by the proposed method with D=4 and
Inr=16; the top left image is the query image.


Figure 6 (a): Retrieved elephant images.

Figure 6 (b): Retrieved fancy flower images.

Figure 6 (c): Retrieved horse images.

Figure 6 (d): Retrieved valley images.

Figure 6 (e): Retrieved Evening Skies images.

Figure 6 (a) to 6 (e): Retrieved images for each class with D=4
and Inr=16 using the proposed integrated method.

The average precision and recall rates of all classes of
images are computed based on MR-ULBP^ri-TCM
features and color histograms, and are listed in Tables 2
and 3. The best performance of MR-ULBP^ri-TCM with
color histograms was obtained when D = 4. The retrieval
performance of the integrated MR-ULBP^ri-TCM is
compared with GLCM [53], color correlogram [54] and
texton co-occurrence matrix [49]. The present paper
selected 60 images of the same category or class as query
images (one by one) and computed precision and recall
rates by selecting the top 16, 25, 35, 45, 55, 65, 75, 85 and
95 images. For Inr=16, the average precision rates of
GLCM, CCG and TCM range from 38% to 45%, 39% to
46% and 60% to 64% respectively across the distance
values, with the maximum at D=4 (Tables 2 and 3). The
average precision and recall rates are plotted in graphs
(Figures 7 and 8) by varying Inr. The present paper also
computed the image retrieval accuracy as defined below.

IR accuracy: A = (precision + recall) / 2    (13)


Table 2: Average precision rate of all classes of images for various
values of the distance parameter D, with Inr=16.

                          Distance parameter
Methods                   D=1    D=2    D=3    D=4    D=5    D=6    D=7
GLCM                      0.38   0.41   0.42   0.45   0.44   0.43   0.43
CCG                       0.39   0.41   0.44   0.46   0.45   0.44   0.43
TCM                       0.60   0.61   0.63   0.64   0.63   0.61   0.62
Proposed MR-ULBP^ri-TCM   0.69   0.71   0.74   0.76   0.75   0.72   0.71

The average IR accuracy with varying number of matches
considered (Inr) is plotted in Figure 9. The proposed
integrated MR-ULBP^ri-TCM achieved the best
performance when compared to the three existing
methods.


Table 3: Average precision rate on each class of images for D=4 and
Inr=16.

                          Image category and precision
Methods                   Elephants   Fancy Flowers   Horses   Valleys   Evening Skies   Average
GLCM                      0.39        0.42            0.44     0.48      0.5             0.45
CCG                       0.4         0.43            0.46     0.49      0.52            0.46
TCM                       0.61        0.6             0.66     0.67      0.7             0.64
Proposed MR-ULBP^ri-TCM   0.71        0.72            0.76     0.81      0.82            0.76



Figure 7: Average performance curve (precision) using GLCM,
CCG, TCM and MR-ULBP^ri-TCM methods with D=4.



Figure 8: Average performance curve (recall) using GLCM, CCG,
TCM and MR-ULBP^ri-TCM methods with D=4.

Figure 9: Average performance curve (accuracy) using GLCM,
CCG, TCM and MR-ULBP^ri-TCM methods with D=4.


5. Conclusions 

The proposed CBIR model integrates texture, shape and
color features. The present paper derived a region based
model and evaluated rotational invariant features in the
form of ULBP^ri. The proposed model is robust, and the
averages can be computed efficiently using integral
images. The small feature set of the multi region model
makes the overall process simple and suitable for large
images, especially in a real time environment. The
rotational invariant ULBP indexing quantizes the image
into 10 levels, which are useful in computing the texton
matrix. The GLCM features derived on MR-ULBP^ri-TCM,
along with color histograms, outperformed the GLCM,
CCG and TCM methods of image retrieval. The proposed
method was evaluated with varying distances and numbers
of retrieved images, and it achieved its best retrieval
results for D=4.